Speeding-Up Hoeffding-Based Regression Trees With Options
Abstract
Data streams are ubiquitous and have, in the last two decades, become an important research topic. For their predictive nonparametric analysis, Hoeffding-based trees are often a method of choice, offering the possibility of any-time predictions. However, one of their main problems is the delay in learning progress caused by equally discriminative attributes. Options are a natural way to deal with this problem. Option trees build upon regular trees by adding splitting options in the internal nodes. As such, they are known to improve accuracy and stability and to reduce ambiguity. In this paper, we present on-line option trees for faster learning on numerical data streams. Our results show that options improve the any-time performance of ordinary on-line regression trees, while preserving the interpretable structure of trees and without significantly increasing the computational complexity of the algorithm.
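To illustrate why equally discriminative attributes delay learning, the following minimal sketch uses the standard Hoeffding bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)). It is not the algorithm from the paper: the function names, the unit value range, and the confidence parameter delta = 1e-6 are assumptions chosen for illustration only. The sketch assumes a tree that waits to split until the observed gap between its two best candidate attributes exceeds epsilon.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Standard Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of its mean over n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def examples_needed(value_range: float, delta: float, gap: float) -> int:
    """Smallest n for which epsilon is no larger than the observed gap between the
    two best split candidates (hypothetical helper, obtained by solving the bound for n)."""
    return math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * gap ** 2))


if __name__ == "__main__":
    # Assumed illustration values: merit scores normalised to [0, 1], delta = 1e-6.
    for gap in (0.2, 0.1, 0.05, 0.01):
        n = examples_needed(1.0, 1e-6, gap)
        print(f"gap={gap:<5} examples needed={n:<7} epsilon={hoeffding_bound(1.0, 1e-6, n):.4f}")
```

Under these assumptions the required number of examples grows with the inverse square of the gap, so two nearly tied attributes can stall a split for a long time. Keeping both candidates as options in the node is one way to avoid that wait.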
Similar resources
Online tree-based ensembles and option trees for regression on evolving data streams
The emergence of ubiquitous sources of streaming data has given rise to the popularity of algorithms for online machine learning. In that context, Hoeffding trees represent the state-of-the-art algorithms for online classification. Their popularity stems in large part from their ability to process large quantities of data with a speed that goes beyond the processing power of any other streaming...
Fast Perceptron Decision Tree Learning from Evolving Data Streams
Mining of data streams must balance three evaluation dimensions: accuracy, time and memory. Excellent accuracy on data streams has been obtained with Naive Bayes Hoeffding Trees—Hoeffding Trees with naive Bayes models at the leaf nodes—albeit with increased runtime compared to standard Hoeffding Trees. In this paper, we show that runtime can be reduced by replacing naive Bayes with perceptron c...
Improving Adaptive Bagging Methods for Evolving Data Streams
We propose two new improvements for bagging methods on evolving data streams. Recently, two new variants of Bagging were proposed: ADWIN Bagging and Adaptive-Size Hoeffding Tree (ASHT) Bagging. ASHT Bagging uses trees of different sizes, and ADWIN Bagging uses ADWIN as a change detector to decide when to discard underperforming ensemble members. We improve ADWIN Bagging using Hoeffding Adaptive...
Predictors of speeding among drivers based on Prototype Willingness Model
Background: Every year, 1.2 million people are killed in road accidents, and speeding is a major contributor to road crashes among young drivers. Speeding is involved in 40% of fatal crashes. The purpose of this study was to determine the predictors of speeding intention among young drivers aged 19-25 in Ghaemshahr, based on the Prototype Willingness Model. Materials and methods: I...
Accurate Ensembles for Data Streams: Combining Restricted Hoeffding Trees using Stacking
The success of simple methods for classification shows that it is often not necessary to model complex attribute interactions to obtain good classification accuracy on practical problems. In this paper, we propose to exploit this phenomenon in the data stream context by building an ensemble of Hoeffding trees that are each limited to a small subset of attributes. In this way, each tree is restr...